Multi-state echo suppressor

ABSTRACT

An echo suppressor uses a multi-state operating sequence to provide non-disruptive monitoring of a remotely broadcast program during a performer's live performance at a local site. The operating sequence is actuated by a voice activated switch that initiates signal processing for correlating a locally originating audio (e.g., the performer's voice) to its echo in the broadcast program. During correlation states in the operating sequence, the echo suppressor switches monitoring to a local microphone signal to allow the performer to hear his or her own voice while preventing echo from being heard before correlation is achieved and echo suppression can begin. The correlation states include an initial correlation at a reduced sampling rate to detect echo within a wide delay time window and determine the locally originating audio is on-the-air. A subsequent state then performs a second correlation at a higher sampling rate to correlate to the echo in a narrower delay time window displaced according to the approximate delay determined in the initial correlation.

FIELD OF THE INVENTION

[0001] The present invention relates to suppressing (or removing) delayed audio feedback effects (also known as echo) from a live broadcast or other transmission.

BACKGROUND AND SUMMARY OF THE INVENTION

[0002] In many commonly occurring live broadcast scenarios, one or more audio signals originate from sources that are located geographically distant from a broadcast studio, and are combined with audio from other distant or in-studio sources to form the broadcast program. For example, television news programs often include segments where a field reporter provides live coverage from the local scene of a newsworthy event. Sometimes, the field reporter's commentary is interrupted or interspersed with questions from an in-studio anchor. As another example, many television and radio talk shows feature live debates between a host and various experts who are electronically conferenced from multiple separate and geographically remote studios. The term “broadcast” as used herein refers to both through-air or wireless transmissions and to transmissions distributed over cable and other on-wire communication networks.

[0003] In these scenarios, it is desirable and even necessary that the local “performer” monitors the actual program being broadcast from the remote studio to receive his or her “cue” to begin speaking, and to hear other parties speak during the program. However, a time delay is introduced as the audio signal of the performer's voice is transmitted to the remote studio (e.g., on a land line, radio, microwave or satellite path) for mixing into the broadcast program, and a further time delay is incurred before the broadcast program transmission arrives back at the performer's site. This time delay is due to the time for the signal to travel along the communications path, as well as delays introduced by various electronics equipment in the path (more particularly, frame synchronizers, digital compressors, and other equipment). This delay produces an echo effect that can be very disconcerting and disruptive to performers (i.e., similar to the effect experienced by a singer in a large stadium), such that the performers may find it difficult (if not impossible) to speak while monitoring the broadcast program and are forced to remove or shut off their earphone to continue their live performance.

[0004] Echo also is a problem in other applications, such as distance learning and telephone conferencing. In some distance learning applications for example, students may attend a lecture transmitted to multiple locales. Often, the audio of the lecture is a mixture of not only the lecturer's microphone, but also of microphones at each of the locales. This allows the students at each locale to freely pose questions, and also hear questions posed by the students at the other locales. When the various microphone inputs are mixed at a central location (e.g., typically the lecturer's site), the students will hear a delayed echo of their own voice in the lecture's audio while posing their questions due to the transmission and other equipment delays. Telephone conferencing among multiple locations experiences a similar echo problem.

[0005] One prior solution to the echo problem in live broadcasts is to transmit a “mix-minus” signal from the remote studio to each local site for monitoring by the performer. At the studio, the audio signal of the performer's voice is mixed in with other audio inputs from other sources to form the broadcast program. An inverse of the performer's audio signal also is mixed with the broadcast program at the studio to form the mix-minus signal, which effectively cancels the performer's voice so that the mix-minus signal contains only the contributions of all the other audio inputs except that of the performer. The performer can then monitor the mix-minus signal without experiencing disconcerting echo effects. An example of a broadcast system using such mix-minus signals is disclosed in Davis, “Mix-Minus Monitor System,” U.S. Pat. No. 5,454,041 (1995).

[0006] A drawback to the mix-minus approach is that an additional separate transmission path for each local performer is needed to transmit their respective mix-minus signal from the studio to their respective local site. The added communications links to the local sites can add significantly to the costs of producing the live broadcast program. Further, the additional signals and communications links to the local sites add considerable complexity to setting up and running production of a live broadcast program, and can increase the chance for technical errors during the program's production.

[0007] Various echo suppression techniques also are known and commonly used in telephone communications, particularly with conference or speaker telephones. In a typical telephone conversation, it is expected that the audio content in each direction is different. Also, it is expected that if any transmit-to-receive leakage (i.e., the “echo”) exists, then the level of the echo will be substantially less than the level of the original audio (typically about 15 dB of attenuation) and has a minimal delay (less than 250 milliseconds). Echo suppression devices that have been used for such telephones generally have relied on these two conditions being present. These conditions, however, do not hold true in the above-described live broadcast, distance learning and conferencing situations. Specifically, the level of the performer's voice in the broadcast program typically is equal to or even greater than that from the performer's microphone, and the delay often is greater than 250 milliseconds (due, for example, to the use of frame synchronizers and digital compression). The signal sent from the performer's site to the remote studio and the program broadcast from the studio often have similar content, particularly when the performer is speaking.

[0008] A further drawback to typical echo suppression techniques is the speed at which the local microphone input can be correlated to echo in the return audio signal, particularly when the delay is unknown over a longer time interval (e.g., greater than 250 milliseconds).

[0009] The present invention provides the capability to suppress or remove echo of a local source audio signal from a remote return audio signal received at the local site, such as in the above-described live broadcast, distance learning and conferencing scenarios. The source audio signal and the return audio signal are digitally processed (e.g., in a digital signal processor running a correlation routine) to detect a time delay and level difference of any echo of the source audio signal contained in the return audio signal. An inverse of the source audio signal adjusted according to the time delay and level difference is then mixed with the return audio signal to suppress or cancel the echo from the return audio signal. The resulting echo-suppressed audio signal can then be mixed with the source audio signal (not delayed) and played on a monitoring device (e.g., a set of headphones or speakers) for comfortable listening, such as by a performer during a live broadcast.

[0010] According to one aspect of the invention, the echo suppression has a multiple state operating sequence that controls the audio signal sent to the monitoring device (e.g., headphones). The multiple state operating sequence accounts for a complex set of conditions, including that the source audio signal is not always “on the air” and that the correlation to detect time delay and level difference requires a finite amount of time to process. For example, a broadcast program audio signal may be sent to the monitoring device in an initial operating state when a live performer is “off the air.” When the performer goes “on the air,” the source audio signal without the return audio signal is played to the monitoring device during one or more states in which the correlation to the echo is sought. Then, the echo-suppressed audio signal is played to the monitoring device in states after the correlation to the echo is achieved. Finally, the echo-suppressed audio signal also preferably is played for an interval approximately equal to the time delay after the source audio signal again goes off the air.

[0011] According to another aspect of the invention, the source audio signal is always sent to the monitoring device (i.e., the performer always hears the signal originating from their microphone) so as to avoid intervals where the performer is unable to hear his or her own voice. Depending on the operating state, two other signals may be added to the source audio signal at different times, which include the return audio signal and the echo-suppressed audio signal (e.g., the broadcast program and the broadcast program with echo removed). Preferably, the volume of the added signal is ramped up when adding the signal during a state switch for smoother audio transitions.

[0012] According to a further aspect of the invention, a voice activated switch (VOX) or like measure of activity on the source audio signal initiates transitions between at least some of the operating states. For example, the VOX initiates a transition from an initial state where the return audio signal is sent to the monitoring device to one or more states where only the source audio signal is sent to the monitoring device and during which correlation to the audio signal takes place.

[0013] According to yet another aspect of the invention, the source audio signal is first correlated to the echo in the return audio signal at a reduced sample rate to define an approximate window (i.e., time interval) for a more exact correlation. This allows the amount of memory and processing needed for the correlation to be reduced, while allowing the correlation to be performed over a much larger time period.

[0014] In another aspect of the invention, values from a prior echo correlation are retained for use in a subsequent correlation so as to provide faster response in cases where the delay of the echo is likely to remain the same.

[0015] In yet another aspect of the invention, a correlation in a narrow window (based on echo delay from a prior correlation) is run simultaneously with a correlation at a reduced sample rate for a wider window. This allows the invention to detect echo over the wider window, while also providing the fast response in cases where the delay remains the same as the prior correlation.

[0016] The above features of the invention allow echo suppression in the specific conditions present in the above live broadcast, distance learning and conferencing scenarios. By defining a multiple state operating sequence under VOX control, the echo suppression according to these aspects of the invention prevents the performer from hearing echo while the correlation is being processed and as the source audio signal goes on and off the air. Further, the dual correlation (at both reduced and full sample rates) allows the echo suppression to be done more quickly for echo at longer time delays.

[0017] Additional features and advantages of the invention will be made apparent from the following detailed description of an illustrated embodiment, which proceeds with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] FIG. 1 is a block diagram of a multi-state echo suppressor according to an illustrated embodiment of the invention for use in a live broadcast geographically distant from a remote broadcast studio.

[0019] FIGS. 2 and 3 are block diagrams of an adaptive finite impulse response (FIR) filter used for echo correlation and suppression in the echo suppressor of FIG. 1.

[0020] FIG. 4 is a flow diagram of an adaptation process for the FIR filter of FIGS. 2 and 3.

[0021] FIG. 5 is a state diagram of a multi-state operating sequence of the echo suppressor of FIG. 1.

[0022] FIG. 6 is a more detailed block diagram of the multi-state echo suppressor of FIG. 1 that shows a phase shifter, compressor, and limiter in a local loop.

DETAILED DESCRIPTION OF THE ILLUSTRATED EMBODIMENTS

[0023] The present invention is directed toward multi-state echo suppression. In one embodiment illustrated herein, the invention is incorporated into an electronic device herein termed a multi-state echo suppressor 10. Briefly described, this product is used at a local site to suppress delayed echo of a local microphone audio signal on a return audio signal from a geographically remote location to allow non-disruptive listening to the return audio signal while speaking on the local microphone, such as by a live performer on a broadcast program, students of a distance learning program, or remote conferencing participants.

[0024] The multi-state echo suppressor 10 connects with four other pieces of equipment at the local site, including a local audio source, a return audio source, a monitoring device, and a transmitter. For connecting to this equipment, the multi-state echo suppressor includes standard cable connectors or ports, e.g., microphone and headphone jacks, RCA-type cable connectors, etc.

[0025] Typically, the local audio source is a microphone 12 or like sound-to-electrical signal transducer that produces an audio signal (the “local audio signal”) from the performer's voice.

[0026] The return audio source can be a broadcast program receiver 14 (e.g., a radio or television receiver) and a from-air antenna 16, or like equipment (e.g., satellite dish) for receiving a remotely broadcast program, such as when the multi-state echo suppressor is used in a live broadcast scenario. The return audio signal also can arrive via other communications media, such as telephone, computer network, cable, optic fiber, and others.

[0027] The program monitoring device (for convenience, referred to hereafter as the headphones 18) is a set of headphones, speakers or other electrical signal-to-sound transducer, which is used by the performer to listen to the return audio signal (e.g., the broadcast program in a live broadcast scenario).

[0028] The transmitter 20 transmits the local audio signal to a remote production studio, broadcast station, or other locales of a distance learning or conferencing application. The transmitter 20 can be a microwave, radio, telephone, modem or other transmitter.

[0029] The multi-state echo suppressor 10 has two primary signal paths, a local loop 22 and a return loop 24. The local audio signal travels on the local loop 22 from the microphone 12 to the transmitter 20. The local audio signal passes through the local loop unchanged, except that some implementations of the multi-state echo suppressor 10 can provide compression or limiting as described more fully below. In some embodiments, the local loop 22 can be routed through the DSP 34 (described below) to provide such compression or limiting. The return loop 24 generally carries the return audio signal received on the from-air antenna 16 by the receiver 14 through the multi-state echo suppressor 10 to the headphones 18. In accordance with the invention, the return audio signal on its path through the return loop 24 has any echo of the local audio signal suppressed.

[0030] Within the multi-state echo suppressor 10, analog-to-digital converters 30-31 digitize the local audio signal and return audio signal at the inputs of the local loop 22 and return loop 24 into digital data streams. A digital signal processor (DSP) 34 having an echo correlator and suppressor (hereafter “DSP echo correlator and suppressor”) resident therein processes the digital data stream from each loop 22, 24. The DSP echo correlator and suppressor uses a multi-state operation sequence described below to effect echo suppression in the signal monitored on the headphones 18 according to the invention. A digital-to-analog converter 37 converts the digital data stream at the output end of the return loop 24 back into analog audio.

[0031] The above-described circuitry in the multi-state echo suppressor 10 can be implemented as one or more integrated circuit chips supported on one or more printed circuit boards. Conventional, commercially available integrated circuits can be used for each of the converters 30-31, 37 and DSP 34. In general, the DSP 34 includes a microprocessor (preferably optimized for processing digital data streams, such as audio signals), non-volatile memory for storing operating code or instructions including the echo correlator and suppressor, and volatile memory to store data being processed. The circuitry is powered by an electrical power supply, which may be a battery (or batteries) and power supply circuit, an alternating current (A/C) converter, or other power supply. Further, the multi-state echo suppressor 10 is housed in an enclosure, which preferably is easily portable for use in mobile or in-field broadcasting scenarios. For example, the multi-state echo suppressor can be housed in an enclosure having a form factor that permits it to be worn or carried by the performer, such as strapped to the body or in a pocket. Alternatively, the multi-state echo suppressor 10 can be integrated with other equipment at the local site, such as a microphone headset, television camera, mobile transceivers, rack communications equipment, etc.

[0032] Echo Correlation and Suppression

[0033] With reference to FIGS. 2-4, the DSP echo correlator and suppressor consists of various signal processing routines that identify a correlation (both delay time and level) of the local audio signal to its echo in the return audio signal, and cancel the echo from the received return audio signal. More particularly, the DSP echo correlator and suppressor routines implement an adaptive transverse finite impulse response (FIR) filter 50 (FIGS. 2 and 3) and a stochastic gradient (least mean squared or LMS) adaptation process 92 (FIG. 4) that effectively models the impulse response of the audio system as a whole (i.e., from local audio signal transmission to return audio signal reception). The multi-state echo suppressor 10 (FIG. 1) uses this FIR filter 50 to generate an estimate of the local audio signal's echo in the return audio signal, which the suppressor 10 then subtracts from the return audio signal to cancel the echo.

[0034] With reference to FIG. 2, the FIR filter 50 takes digitized samples of the local audio signal (“S(n)”), together with a set of adaptively modified coefficients (“C(n)₁, C(n)₂, . . . C(n)ₖ”) as its inputs. The FIR filter 50 shifts samples of the local audio signal through a set of stages 54-58, which effects a cumulative sample delay. At each sample time n, the FIR filter 50 shifts the current local audio signal sample at an input 52 into an initial stage 54 of the filter, and shifts previous samples to subsequent stages 55-58 of the filter. The stages thus produce a set of local audio signal samples at successive delays (i.e., S(n−1), S(n−2), S(n−3), . . . ). The coefficients are input to the filter at a set of filter taps 60-64. The FIR filter 50 multiplies the samples at each stage 54-58 by respective coefficients at the filter taps 60-64 as represented at multiplication operations 66-70. Finally, the FIR filter 50 sums these products in an addition operation 72 to form a current sample of the estimated echo (“Ŝₑ(n)”) as the filter's output 76.
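Written compactly, and assuming the k stages hold the delayed samples S(n−1) through S(n−k) as the paragraph above describes, the filter output can be expressed as a single sum. The index i is introduced here only for illustration:

```latex
\hat{S}_e(n) \;=\; \sum_{i=1}^{k} C(n)_i \, S(n-i)
```

Each term corresponds to one of the multiplication operations 66-70, and the summation corresponds to the addition operation 72.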

[0035] With reference to FIGS. 3 and 4, the coefficients of the FIR filter 50 (FIG. 2) are updated in a stochastic gradient LMS adaptation process 92 (FIG. 4) that adapts the FIR filter 50 to model the impulse response of the broadcast system. In an initial step 93, the process 92 acquires and inputs a next local audio signal sample S(n) for the current sample time into the FIR filter 50. The process 92 then uses the FIR filter 50 at step 94 to generate an estimate Ŝₑ(n) of the echo for the current sample time. At step 95, the process 92 acquires a next received return audio signal sample R(n) for the current sample time. The process 92 then calculates the difference or error E(n) between the received return audio signal sample R(n) and the echo estimate Ŝₑ(n) at step 96.

[0036] Finally, at a step 97, the process 92 updates the coefficients as shown in FIG. 3. In each of the FIR filter's stages 54-58, the delayed local audio signal sample S(n−1), S(n−2), . . . S(n−k) at that stage is multiplied in multiplication operations 80-84 by the product of the error E(n) and an adaptation constant β (that determines the rate of convergence). The result is then added in addition operations 86-90 to the current value of the respective coefficients C(n)₁, C(n)₂, . . . C(n)ₖ. This yields the values of the coefficients C(n+1)₁, C(n+1)₂, . . . C(n+1)ₖ to be applied to the filter taps in the subsequent sample time n+1.
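The steps 93-97 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the DSP routine itself: the function name lms_identify, the 8-tap filter, the synthetic white-noise "voice," and the simulated echo (a 5-sample delay at 0.8 gain) are all hypothetical choices made so the example runs quickly.

```python
import random


def lms_identify(local, returned, num_taps, beta):
    """Plain LMS adaptation of a transversal FIR filter (illustrative sketch).

    history[i] holds the delayed local sample S(n-1-i), so coeffs[i]
    models an echo path with a delay of i+1 samples.
    """
    coeffs = [0.0] * num_taps          # coefficients reset to zero
    history = [0.0] * num_taps         # filter stages (delayed local samples)
    errors = []
    for s_n, r_n in zip(local, returned):
        # Echo estimate: sum of coefficient * delayed sample (step 94).
        estimate = sum(c * s for c, s in zip(coeffs, history))
        # Error between return sample and estimate (steps 95-96).
        error = r_n - estimate
        # Coefficient update: C += beta * E(n) * S(n-i) (step 97).
        step = beta * error
        for i in range(num_taps):
            coeffs[i] += step * history[i]
        errors.append(error)
        # Shift the current sample into the first stage for the next sample time.
        history.insert(0, s_n)
        history.pop()
    return coeffs, errors


# Hypothetical test signal: white noise standing in for the performer's voice,
# and a "return" signal that is just that noise delayed 5 samples at 0.8 gain.
rng = random.Random(0)
local = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
returned = [0.0] * 5 + [0.8 * s for s in local[:-5]]

coeffs, errors = lms_identify(local, returned, num_taps=8, beta=0.05)
peak = max(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
print("peak tap index:", peak, "value:", round(coeffs[peak], 3))
```

After adaptation, the largest coefficient sits at the tap matching the simulated delay and approaches the simulated echo gain, while the error, which is the echo-suppressed return signal, decays toward zero.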

[0037] In the illustrated multi-state echo suppressor 10 (FIG. 1), the adaptive FIR filter 50 uses a sample rate of approximately 9600 Hz and 960 as the number k of coefficients. With these parameters, the adaptive FIR filter can correlate and suppress echo within a delay time window approximately 100 milliseconds (ms) wide. The adaptive FIR filter further uses an adaptation constant β in the range of 0.05 to 0.1. The processing in the DSP 34 required for the correlation to converge on the echo within this 100 ms window takes approximately two to three seconds. In other embodiments, these parameters can be varied to achieve a desired tradeoff in processing demands (or cost) and performance. For example, the 9600 Hz sampling rate and 960 coefficients are considered to provide generally adequate echo suppression for voice communications, but a higher sampling rate and/or larger number of coefficients can be used where greater clarity is required.

[0038] Multi-State Operation

[0039] With reference now to FIG. 5, the routines of the DSP echo correlator and suppressor (resident in DSP 34 of FIG. 1) are performed in a six state operation sequence with states 100-106. The states 100-106 control which of the DSP echo correlator and suppressor's correlation routines are run on the DSP 34 (FIG. 1), and determine the audio content of the signal that is output to the headphones 18 (FIG. 1) on the return loop 24. The six state operation sequence of the DSP echo correlator and suppressor allows for the fact that the local audio signal from the microphone 12 (FIG. 1) is not always “on-air” (i.e., contained in the return audio signal) and allows time to correlate to and suppress echo from the return audio signal.

[0040] At initial power on (indicated at 110), the DSP echo correlator and suppressor transitions from a “powered off” mode 100 to a reset state 101. The echo correlator and suppressor remains in the reset state 101 as long as no voice activity is detected on the local audio signal from the microphone 12 (FIG. 1). Since there is no microphone activity in the reset state 101, the DSP echo correlator and suppressor assumes that there is no echo in the return audio signal. Consequently, the DSP echo correlator and suppressor causes the return audio signal to be passed unchanged on the return loop 24 to the headphones 18 (FIG. 1).

[0041] When in the reset state 101, the DSP echo correlator and suppressor also resets the FIR filter coefficients C(n)₁, C(n)₂, . . . C(n)ₖ to zero, preparatory to performing echo correlation in later states (e.g., states 102 and 103 as described below). In some embodiments, the DSP correlator and suppressor alternatively can reset the coefficients to zero for only a first correlation to the echo in the return audio signal (e.g., after the initial power-on of the multi-state echo suppressor 10). For subsequent echo correlations (e.g., on the same broadcast program), the DSP echo correlator and suppressor can retain the coefficients resulting from a prior correlation to echo in the return audio signal, so as to provide faster convergence in the subsequent echo correlation.

[0042] The DSP echo correlator and suppressor uses a voice activated switch (VOX) to initiate a transition 111 to an on-air check state 102 where echo correlation begins (as well as to trigger transitions between other states as described below). The VOX in the illustrated multi-state echo suppressor 10 (FIG. 1) is implemented as a routine resident in the DSP 34, which detects activity (e.g., signal energy) in the local audio signal greater than a threshold level. This routine is run continuously (i.e., on each microphone sample) or at frequent intervals by the DSP echo correlator and suppressor. The VOX produces VOX-on and VOX-off conditions according to the detected activity of the local audio signal on a fast attack and slow release time basis.
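One simple way to obtain the fast attack and slow release behavior is a hold counter, sketched below. The class name Vox, the per-sample magnitude test, and the threshold and release values are assumptions for illustration; the text does not specify how the illustrated VOX routine measures activity or times its release.

```python
class Vox:
    """Voice activated switch sketch: fast attack, slow release.

    Assumption: activity is judged from the magnitude of each sample, and
    release is modelled as a countdown of quiet samples.
    """

    def __init__(self, threshold, release_samples):
        self.threshold = threshold
        self.release_samples = release_samples
        self.hold = 0  # quiet samples remaining before VOX-off

    def step(self, sample):
        if abs(sample) > self.threshold:
            # Fast attack: a single loud sample asserts VOX-on at once.
            self.hold = self.release_samples
        elif self.hold > 0:
            # Slow release: VOX-on persists while the countdown runs.
            self.hold -= 1
        return self.hold > 0  # True means VOX-on


vox = Vox(threshold=0.1, release_samples=3)
states = [vox.step(x) for x in [0.0, 0.5, 0.0, 0.0, 0.0, 0.0]]
print(states)
```

A slow release keeps short pauses between words from toggling the state machine back to the reset state mid-sentence.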

[0043] When the VOX signals a VOX-on condition (such as due to detecting the performer's voice) while in the reset state 101, the DSP echo correlator and suppressor goes to the on-air check state 102. Since there is now activity on the local audio signal, an echo of the local audio signal may be present in the return audio signal if the performer is “on-the-air.” The DSP echo correlator and suppressor prevents the performer from hearing such echo by causing the multi-state echo suppressor 10 to switch the output to the headphones 18 away from the return audio signal and over to the local audio signal alone. The performer thus hears the performer's own voice whenever the performer begins speaking. Also, if the performer happens to be “on-the-air,” the performer is prevented from hearing an echo of their voice in the return audio before the multi-state echo suppressor is able to correlate to and begin suppressing the echo.

[0044] In an alternative embodiment of the invention, the multi-state echo suppressor 10 passes the return audio signal to the headphones at a reduced volume (e.g., attenuated by about 16 dB) while in the on-air check state 102 (and the correlate and wait state 103), so that audio cues to the performer in the return audio signal can still be perceived but echo is no longer distracting. In some such alternative embodiments, the level of attenuation is selectable by the user via a back panel switch (e.g., with attenuation settings of −6 dB, −10 dB, −16 dB, . . . , and infinite), which can be part of the controls 136 in FIG. 6 described below. At the infinite attenuation setting, the multi-state echo suppressor 10 completely switches off the return audio signal, and outputs only the local audio signal to the headphones 18 (FIG. 1).
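As a small illustration of the attenuation settings, the sketch below converts a decibel setting to a linear amplitude gain and mixes the attenuated return signal with the local signal. The function names and the amplitude convention (20·log10) are assumptions; the text does not state how the illustrated device applies the attenuation.

```python
import math


def attenuation_gain(setting_db):
    """Linear amplitude gain for an attenuation setting in dB.

    Accepts either sign (16 or -16 both mean 16 dB of attenuation);
    an infinite setting mutes the signal entirely.
    """
    if math.isinf(setting_db):
        return 0.0
    return 10.0 ** (-abs(setting_db) / 20.0)


def monitor_sample(local, returned, setting_db):
    """Headphone sample: local audio plus the attenuated return audio."""
    return local + attenuation_gain(setting_db) * returned


for setting in (-6, -10, -16, math.inf):
    print(setting, round(attenuation_gain(setting), 4))
```

At the −16 dB setting the return audio reaches the headphones at roughly 16 percent of its original amplitude, and at the infinite setting only the local signal remains.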

[0045] In the on-air correlation check state 102, the DSP echo correlator and suppressor runs an on-air check routine to check whether any contribution from the local audio signal is present in the return audio signal (e.g., whether the performer is “on-the-air”). The on-air check routine uses the processing described above for the FIR filter 50 (FIG. 2) and adaptation process 92 (FIGS. 3 and 4) to correlate the local audio signal to echo in the return audio signal. For the on-air check, the DSP echo correlator and suppressor runs the FIR filter 50 at a reduced sampling rate (as compared to the sampling rate used during normal correlation for the echo suppression in the run state 105 described below). This allows the on-air check to correlate to the echo more quickly over a larger delay time window, but with reduced resolution.

[0046] The reduction in sample rate also allows a commensurate increase in filter length for a given DSP processing load. More particularly, the processing load (e.g., in MIPS) on the DSP is the limiting factor of the echo correlation and suppression process, and is generally proportionate to the product of the sample rate (Fₛ in Hz) and the filter length (n). Accordingly, the filter length can be increased by the same factor by which the sample rate is reduced to yield approximately the same DSP processing load. The increased filter length also increases the size of the delay time window, which size equals the quotient (n/Fₛ). Thus, for a factor (m) times reduction in sample rate together with the same factor (m) increase in filter length, the window size is increased by (m²) times. Preferably, to simplify computation, the factor is an integer. In the illustrated multi-state echo suppressor for example, the reduced sampling rate in the on-air check state is ⅓ of the full sampling rate during echo suppression (e.g., a reduced sampling rate of 3200 Hz for a full sampling rate of 9600 Hz), which allows a three times increase in the filter length at approximately the same DSP processing load. This allows an increase in delay time window size of about nine times (e.g., to 800 ms in the illustrated suppressor 10).
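The arithmetic can be checked with a short sketch. The function names are hypothetical, and the load figure is simply the sample rate times filter length product that the paragraph names as the proportionality, not a measured MIPS value.

```python
def delay_window_ms(sample_rate_hz, filter_length):
    """Delay time window covered by the filter: n / Fs, in milliseconds."""
    return 1000.0 * filter_length / sample_rate_hz


def reduced_rate_plan(full_rate_hz, full_length, m):
    """Reduce the rate by integer factor m and lengthen the filter by m."""
    return full_rate_hz // m, full_length * m


full_window = delay_window_ms(9600, 960)
rate, length = reduced_rate_plan(9600, 960, m=3)
reduced_window = delay_window_ms(rate, length)

print("full-rate window (ms):", full_window)
print("reduced rate (Hz), length:", rate, length)
print("reduced-rate window (ms):", reduced_window)
print("relative load:", (rate * length) / (9600 * 960))
```

With exactly 3 × 960 coefficients at 3200 Hz, the idealized window works out to 900 ms, nine times the 100 ms full-rate window and of the same order as the roughly 800 ms figure cited for the illustrated suppressor, while the rate times length product is unchanged.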

[0047] More particularly, when the FIR filter 50 is run during the on-air check, the adaptation process 92 modifies the coefficients C(n)₁, C(n)₂, . . . C(n)ₖ to match the system's impulse response. The coefficients produced by this processing generally represent an impulse response wave form of the system (i.e., the system including the transmission path to the remote mixing site and return therefrom). When the performer is on-the-air, the impulse response wave form appears as a slightly widened impulse that is displaced in time by the delay from the local audio signal to its echo in the return audio signal. Accordingly, the on-air correlation check determines that the performer is on-the-air when the wave form represented by the coefficients has this shape. Where the local audio signal is detected to be on-the-air, the displacement of the impulse in the impulse response wave form also indicates a delay time of the local audio signal's echo in the return audio signal.

[0048] In other words, the FIR filter coefficients at the beginning of the on-air check are initially reset (e.g., in the reset state 101 as described above). The coefficients will generally remain at zero, except for a coefficient at the filter stage (or group of coefficients at adjacent filter stages) corresponding to the echo's delay. Such coefficient is dynamically adjusted by the LMS adaptation process 92 (FIG. 3) to a value indicative of the magnitude of the echo relative to the local audio signal. The DSP correlator and suppressor determines that the impulse response has the displaced impulse shape (and thus the performer is on-the-air) if any coefficient has a value indicating a magnitude that is at least a substantial fraction of the local audio signal (e.g., at least ½ the magnitude of the local audio signal). In some embodiments, the DSP correlator and suppressor may additionally check that other coefficients are approximately zero (e.g., coefficients other than some tightly grouped or adjacent coefficients with values indicative of an echo magnitude at least a substantial fraction of the local audio signal).
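The decision rule described above might be sketched as follows. The function name detect_on_air, the return convention, and the mapping of list index i to a delay of i+1 samples are assumptions for illustration; the optional check that the remaining coefficients are near zero is omitted for brevity.

```python
def detect_on_air(coeffs, sample_rate_hz, min_fraction=0.5):
    """Return the approximate echo delay in ms, or None if not on-the-air.

    Assumption: coeffs[i] models a delay of i+1 samples, and each
    coefficient's value is the echo magnitude relative to the local signal.
    """
    peak_index = max(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    if abs(coeffs[peak_index]) < min_fraction:
        return None  # no coefficient is a substantial fraction: off the air
    return 1000.0 * (peak_index + 1) / sample_rate_hz


# Hypothetical converged coefficients at the reduced 3200 Hz rate:
# all near zero except a strong tap 960 samples (300 ms) out.
coeffs = [0.0] * 2880
coeffs[959] = 0.9
print(detect_on_air(coeffs, 3200))        # delay estimate in ms
print(detect_on_air([0.01] * 2880, 3200))  # None: nothing substantial
```

The returned delay is what a later, finer correlation would use to place its narrower window.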

[0049] In the illustrated multi-state echo suppressor, the FIR filter and adaptation processing performed in the state 102 for the on-air check is done at a reduced sample rate (as compared to the processing in the correlate and wait state 103 described below). This reduced sampling rate correlation allows the DSP echo correlator and suppressor to detect correlation between the local audio signal and the return audio signal over a much longer period of time (window) for a given consumption of memory and processing resources of the DSP, but results in a more approximate estimate of the echo's delay time (e.g., to within about a 100 millisecond window in the illustrated multi-state echo suppressor 10). Each correlation takes approximately two to three seconds for the coefficients to converge sufficiently. By combining the reduced sampling rate correlation with a second correlation at a higher sample rate (i.e., in state 103 described below) that correlates within the 100 millisecond window, the echo correlator and suppressor can more quickly determine the echo's delay time when it is unknown to within a large time period. Otherwise, each attempt to correlate the signals within each successive 100 millisecond window using the higher sample rate would take approximately two to three seconds to complete. To achieve savings with the combination of reduced and full sampling rate correlations over repeating full sampling rate correlations for successive adjacent windows, the full sampling rate should be at least twice the reduced sampling rate (which results in a window for the reduced sampling rate correlation that is at least twice the size of the window in the full sampling rate correlation for an FIR filter having a given number of coefficients).
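A rough worst-case comparison makes the savings concrete. This sketch assumes, purely for illustration, that every correlation (reduced or full rate) costs the same fixed time and that a sequential search must try every adjacent window before finding the echo; the function names are hypothetical.

```python
def sequential_scan_seconds(span_ms, window_ms, seconds_per_correlation):
    """Worst case: full-rate correlations over successive adjacent windows."""
    windows = -(-span_ms // window_ms)  # ceiling division
    return windows * seconds_per_correlation


def two_stage_seconds(seconds_per_correlation):
    """One reduced-rate correlation followed by one full-rate correlation."""
    return 2 * seconds_per_correlation


# A 900 ms search span, 100 ms full-rate windows, 3 s per correlation.
print(sequential_scan_seconds(900, 100, 3))  # worst-case sequential search
print(two_stage_seconds(3))                  # coarse pass then fine pass
```

Under these assumptions the two-stage approach finishes in two correlation periods regardless of where the echo falls, which also shows why a rate ratio of two is only the break-even point: two adjacent full-rate windows would take the same two periods.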

[0050] The echo correlator and suppressor remains in the on-air check state (continuing to run the on-air check) until the local audio signal is determined to be on-the-air, or until the VOX-off condition obtains (e.g., when the performer stops speaking). From the on-air check state 102, the echo correlator and suppressor makes a transition 112 to a correlate and wait state 103 if the on-air correlation check determines that the local audio signal is on-the-air. If a VOX-off condition is instead signaled, then a transition 113 is made back to the reset state 101.

[0051] At the correlate and wait state 103, the echo correlator and suppressor in the DSP 34 (FIG. 1) performs a correlation check again using the adaptive FIR filter 50, but at the full sampling rate to more accurately determine the delay time and magnitude of the echo. This check correlates the local audio signal and return audio signal within a narrower window (e.g., about 100 ms in the illustrated multi-state echo suppressor) centered on the delay time determined by the reduced sampling rate correlation performed for the on-air check in the on-air check state 102. More specifically, the check centers the narrower window according to the delay time from the reduced sampling rate correlation by adding additional delay (hereafter “pre-filter delay”) to the local audio signal samples before the first stage 54 (FIG. 2) of the FIR filter 50. The check then resets the coefficients to zero, and runs the adaptive FIR filter 50 and adaptation process 92.
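
As a rough sketch of how the pre-filter delay could center the narrower window, the following helpers convert the approximate delay into a sample count and recover the total echo delay once the full-rate correlation finds a peak. The function names and the 8000 Hz rate in the example are hypothetical.

```python
def pre_filter_delay_samples(approx_delay_ms, narrow_window_ms, sample_rate_hz):
    """Samples of delay to add ahead of the first FIR stage so that the
    narrow window is centered on the approximate delay (clamped at zero)."""
    start_ms = max(0.0, approx_delay_ms - narrow_window_ms / 2.0)
    return int(round(start_ms * sample_rate_hz / 1000.0))


def total_echo_delay_ms(pre_delay_samples, peak_stage, sample_rate_hz):
    """Echo delay implied by the pre-filter delay plus the peak filter stage."""
    return 1000.0 * (pre_delay_samples + peak_stage) / sample_rate_hz


# Reduced-rate pass estimated about 180 ms; the window is 100 ms wide.
pre = pre_filter_delay_samples(180.0, 100.0, 8000)
print(pre, total_echo_delay_ms(pre, 400, 8000))
```

In this example the window spans 130 ms to 230 ms, so a peak at the middle stage (400 of 800) maps back to the 180 ms estimate.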

[0052] As described above, the correlation using the adaptive FIR filter 50 at the full sampling rate takes approximately two to three seconds to converge in the illustrated multi-state echo suppressor 10. Until convergence, the adaptive FIR filter 50 is not able to suppress the echo in the return audio signal. The DSP echo correlator and suppressor continues to prevent the performer from hearing echo while speaking during the correlate and wait state 103, by causing the multi-state echo suppressor 10 to output the local audio signal to the headphones 18 (FIG. 1) and not the return audio signal (or alternatively attenuating the return audio signal output to the headphones).

[0053] If the VOX-off condition is signaled (e.g., the performer stops speaking) during the correlation in the correlate and wait state 103, the echo correlator and suppressor makes a transition 114 to a correlation pause state 104. In the correlation pause state 104, the echo correlator and suppressor causes the return audio signal to again be passed to the headphones 18. Preferably, the slow release time of the VOX or a slow volume ramp-up of the return audio signal prevents the performer from hearing a last echo of the performer's voice when the return audio signal is again passed to the headphones 18 in this state after the performer ceases speaking. When the VOX-on condition is again signaled as indicated by a transition 115, the echo correlator and suppressor returns to the correlate and wait state 103 where correlation to the echo is re-attempted. During the pause state 104, the DSP echo correlator and suppressor stops running the adaptive filter 50 and adaptation process 92, but can retain the coefficients produced to that point so as to allow the correlation to complete more quickly when resumed.

[0054] When a satisfactory correlation to the echo is achieved in the correlate and wait state 103 (typically after about two to three seconds of a continuous VOX-on condition while on-the-air in the illustrated echo suppressor 10), the echo correlator and suppressor makes a transition 116 to a run state 105. In the illustrated multi-state echo suppressor, the condition for satisfactory correlation in the correlation check of state 103 is similar to that for the on-air check in state 102. More particularly, the DSP echo correlator and suppressor determines that a satisfactory correlation has been obtained when a coefficient has reached a threshold value indicating an echo magnitude that is a substantial fraction (e.g., one half) of the local audio signal.

[0055] In the run state 105, the DSP echo correlator and suppressor begins suppressing the echo in the return audio signal. The echo correlator and suppressor in the DSP 34 (FIG. 1) subtracts the echo estimate Ŝ_(e)(n) produced by the adaptive FIR filter 50 (FIG. 2) from the return audio signal data stream. This effectively cancels the echo, resulting in an echo suppressed return audio signal. The DSP echo correlator and suppressor causes the multi-state echo suppressor 10 to switch to outputting the echo suppressed return audio signal to the headphones 18 (FIG. 1).
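
The subtraction of the echo estimate can be illustrated with a small adaptive FIR filter. The sketch below uses a normalized LMS update, which is a common variant swapped in here for numerical robustness and is not stated to be the update used by the adaptation process 92. The tap count, step size and the synthetic test signal are assumptions.

```python
import random


def nlms_echo_suppress(local, ret, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR filter on the local signal and subtract its echo
    estimate from the return signal. Returns (coefficients, output)."""
    w = [0.0] * num_taps
    history = [0.0] * num_taps          # most recent local sample first
    out = []
    for x, d in zip(local, ret):
        history = [x] + history[:-1]
        estimate = sum(wi * hi for wi, hi in zip(w, history))
        error = d - estimate            # echo suppressed return sample
        norm = sum(h * h for h in history) + eps
        w = [wi + mu * error * hi / norm for wi, hi in zip(w, history)]
        out.append(error)
    return w, out


# Synthetic check: the return signal is the local signal delayed by
# 5 samples and scaled by 0.6 (a pure echo with no other program audio).
rng = random.Random(0)
local = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ret = [0.6 * local[n - 5] if n >= 5 else 0.0 for n in range(len(local))]
w, out = nlms_echo_suppress(local, ret)
print(round(w[5], 3))
```

After adaptation, the coefficient at the stage matching the delay approaches the echo's relative magnitude, and the residual output approaches zero for this echo-only test signal.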

[0056] While in the run state 105, the echo suppressor and correlator makes a transition 117 to and a transition 118 back from a pause state 106 in response to the VOX-off and VOX-on conditions, respectively. In the pause state 106, the echo suppressor and correlator retains the correlation coefficients determined in the correlate and wait states indefinitely on the assumption that the VOX-on condition will resume momentarily. The echo suppressor and correlator also ceases to suppress the echo, allowing the return audio signal to pass unchanged to the headphones 18. When returned to the run state 105, the DSP echo correlator and suppressor restarts the adaptive FIR filter 50 with the retained coefficients, which allows the multi-state echo suppressor 10 to immediately resume echo suppression (i.e., without waiting for a new correlation).

[0057] The echo suppressor and correlator goes back to the reset state 101 from the correlate and wait state 103 or run state 105 upon detecting that the local audio signal is no longer present in the return audio signal (i.e., the performer is “off-the-air”). In the illustrated multi-state echo suppressor 10 (FIG. 1), the determination that the local audio signal is off-the-air is made when the FIR filter coefficients all drop below a value indicative of an echo magnitude that is a preset fraction of the local audio signal (where the preset fraction is less than the substantial fraction used for the on-air and correlation checks).
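
The operating sequence described in the preceding paragraphs can be summarized as a small state table. This is a sketch under stated assumptions: the enum names, the `next_state` function and the order in which simultaneous conditions are tested are hypothetical, and the monitor source listed for the reset and on-air check states is inferred rather than stated in these paragraphs.

```python
from enum import Enum


class State(Enum):
    RESET = 101
    ON_AIR_CHECK = 102
    CORRELATE_AND_WAIT = 103
    CORRELATION_PAUSE = 104
    RUN = 105
    RUN_PAUSE = 106


# Signal passed to the headphones in each state (partly assumed, see above).
MONITOR_SOURCE = {
    State.RESET: "return",
    State.ON_AIR_CHECK: "local",
    State.CORRELATE_AND_WAIT: "local",
    State.CORRELATION_PAUSE: "return",
    State.RUN: "echo_suppressed_return",
    State.RUN_PAUSE: "return",
}


def next_state(state, vox_on, on_air=False, correlated=False, off_air=False):
    """Return the next state given the VOX and correlation conditions."""
    if state is State.RESET:
        return State.ON_AIR_CHECK if vox_on else state            # 111
    if state is State.ON_AIR_CHECK:
        if not vox_on:
            return State.RESET                                    # 113
        return State.CORRELATE_AND_WAIT if on_air else state      # 112
    if state is State.CORRELATE_AND_WAIT:
        if off_air:
            return State.RESET
        if not vox_on:
            return State.CORRELATION_PAUSE                        # 114
        return State.RUN if correlated else state                 # 116
    if state is State.CORRELATION_PAUSE:
        return State.CORRELATE_AND_WAIT if vox_on else state      # 115
    if state is State.RUN:
        if off_air:
            return State.RESET
        return state if vox_on else State.RUN_PAUSE               # 117
    if state is State.RUN_PAUSE:
        return State.RUN if vox_on else state                     # 118
    raise ValueError(f"unknown state: {state!r}")


s = next_state(State.RESET, vox_on=True)
s = next_state(s, vox_on=True, on_air=True)
s = next_state(s, vox_on=True, correlated=True)
print(s, MONITOR_SOURCE[s])
```

The numbered comments refer to the transitions named in the text; the two returns to the reset state on the off-the-air condition are not given transition numbers in these paragraphs.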

[0058] Built-in Microphone Pre-Amplifier

[0059] FIG. 6 shows the multi-state echo suppressor 10 (also shown in FIG. 1) in more detail. The illustrated multi-state echo suppressor 10 includes an amplifier 130 connected at a local input 132 in the local loop 22 (where the microphone 12 connects to the multi-state echo suppressor). The amplifier 130 acts as a built-in microphone pre-amplifier. The multi-state echo suppressor 10 has a set of controls 136, which include a switch to select between unity gain and pre-amplification. This switch sets a local input level signal that controls the amount of amplification of the amplifier 130. With the switch in the unity gain setting, the multi-state echo suppressor 10 provides unity gain in the local loop 22 and is useable for a mode of operation (the “mic-in/mic-out mode”) which accepts a microphone input level signal (such as at about −57 to −62 dBm) at the local input 132 and outputs a microphone output level signal at a local output 133 (i.e., the output to the transmitter 20). The unity gain setting also is applicable to a mode of operation (the “line-in/line-out mode”) with a line-in level signal at both local input 132 and local output 133.

[0060] On the other hand, with the switch in the pre-amplification setting, the multi-state echo suppressor 10 provides a mode of operation (the “mic-in/line-out mode”) which accepts a microphone input level signal at the local input 132, and amplifies the local audio signal in the pre-amplifier 130 to a line-out level signal at the local output 133. The pre-amplifier 130 has an adjustable output level that can be set based on an external reference. Phantom power (about 48 VDC and 1 mA) is selectable. The pre-amplifier 130 provides approximately 50 dB gain for a line-in level local audio signal on the local loop 22 from a microphone input level signal.

[0061] The multi-state echo suppressor 10 also has amplifiers 136-138 at each of the local output 133 (e.g., to transmitter 20), return input 134, and output 135 (e.g., to headphones 18). These amplifiers allow adjustment of the signal levels at these outputs and input, such as 0 dBm, 4 dBm, etc. The amplification is controlled by input level signals applied to the amplifiers that can be set by the user via the controls 136.

[0062] Audio Compressor, Limiter, and Phase Shifter

[0063] With reference still to FIG. 6, the performance of the multi-state echo suppressor 10 is optimized when the local audio signal scales linearly to its echo in the return audio signal. Any non-linear variations that are introduced into the local audio signal after the local loop 22 (i.e., between its output from the local output 133 and its echo in the return audio signal) will adversely affect the echo correlation and suppression. This results in anomalies or noise being added to the echo suppressed return audio signal that is output at the output 135 on the return loop 24. The non-linearities can result when the local audio signal is modified by compression and/or limiting (i.e., clipping the signal to a desired range) after the local loop 22, such as in transmission equipment or at the remote studio.

[0064] So that any desired compression and limiting of the local audio signal is performed before the echo correlation and suppression, the multi-state echo suppressor 10 preferably provides a built-in compressor 140 and a limiter 142. The compressor 140 and limiter 142 are located in the local loop 22 before a dual codec 146, which provides the analog-to-digital and digital-to-analog conversion shown at 30-31 and 37 in FIG. 1. The compression ratio of the compressor is controlled by a compression ratio value applied at an input of the compressor 140. A compression switch 144 can be set by the user to selectively shut off the compressor 140, if compression is not desired. The limiter 142 clips the local audio signal to a suitable operating range for the audio system. The illustrated multi-state echo suppressor 10 also includes a limiter 152 in the return loop 24 for limiting the return audio signal to the input range of the dual codec 146.

[0065] The illustrated multi-state echo suppressor 10 also includes a phase shifter 148 in the local loop 22 to further improve echo correlation and suppression. Typically, a performer's voice produces an audio signal that is asymmetrical about the zero level axis. The phase shifter 148 is a phase chasing circuit that improves the symmetry of the audio signal about the axis prior to the echo correlation and suppression via the codec 146 and the DSP 34. The improved symmetry of the local audio signal results in the echo correlation and suppression process being better able to match and remove the echo in the return audio signal. The multi-state echo suppressor 10 includes a switch 150 for selectively activating or deactivating the phase shifter 148.

[0066] Voice Reinforcement

[0067] With reference still to FIG. 6, the illustrated multi-state echo suppressor enhances the perceived echo suppression using voice reinforcement. With the voice reinforcement, the local microphone audio signal (e.g., of the performer's voice) is enhanced relative to the echo suppressed return audio signal when output at the output 135 (FIG. 6) to the headphones 18 (FIG. 1). The voice reinforcement can be performed in the run state 105 (FIG. 5) when the local microphone audio signal is combined with the echo suppressed return audio signal at the output, or alternatively in all states in which the local microphone audio signal is output to the headphones 18.

[0068] Upon correlation, the DSP 34 determines the level of the echo in the return audio signal, and causes the codec 146 to scale the local audio signal by an appropriate multiplier relating to the ratio of this level to that of the local audio signal. Without voice reinforcement, the codec 146 simply scales the local audio signal to match the return audio signal's level. With voice reinforcement, the codec 146 scales the local audio signal to a slightly louder level than the return audio signal, such as about 9 dB above the return audio signal's level. The scaled local audio signal is then combined with the echo suppressed return audio signal for output to the headphones 18 (FIG. 1) at the output 135 (FIG. 6). This voice reinforcement serves to mask the suppressed echo in the output audio. In other words, the voice reinforcement creates the perception of greater echo suppression or attenuation.
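
The scaling described above can be worked through numerically. The sketch assumes that the 9 dB figure is an amplitude ratio (so the multiplier is 10 raised to dB/20) and that the echo level is expressed as a fraction of the local audio level; the function names are hypothetical.

```python
def local_monitor_gain(echo_to_local_ratio, reinforcement_db=0.0):
    """Multiplier applied to the local audio signal before mixing.

    echo_to_local_ratio: echo level in the return signal divided by the
    local signal level (the ratio found on correlation).
    reinforcement_db: extra level above the return signal; 0 dB simply
    matches the return level.
    """
    return echo_to_local_ratio * 10.0 ** (reinforcement_db / 20.0)


def monitor_mix(local_sample, suppressed_return_sample, gain):
    """Combine the scaled local sample with the echo suppressed return."""
    return gain * local_sample + suppressed_return_sample


matched = local_monitor_gain(0.5)          # no voice reinforcement
reinforced = local_monitor_gain(0.5, 9.0)  # about 9 dB of reinforcement
print(matched, round(reinforced, 3))
```

A 9 dB amplitude boost corresponds to a factor of roughly 2.8, so under these assumptions the reinforced local signal sits at about 2.8 times the level it would have when merely matched to the return signal.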

[0069] In some embodiments of the invention, a switch is provided in the controls 136 (FIG. 6) to select between activation and deactivation of voice reinforcement. The controls 136 also can provide a slider, dial or like control to allow user adjustment of the amount of voice reinforcement, i.e., the amount of scaling of the local audio signal above the return audio signal's level when voice reinforcement is selected.

[0070] The illustrated multi-state echo suppressor 10 also includes bar graph generators 156-157 with LEDs that provide a local level meter and return level meter display, for visually monitoring the levels of the respective local and return audio signals. An LED array 158 (of 2×16 LEDs) indicates the status of various user settings (selected via the controls 136) and status of the multi-state echo suppressor.

[0071] Fast Response Echo Correlation

[0072] Referring again to the state diagram of FIG. 5, the multi-state echo suppressor 10 features a fast response echo correlation routine on the DSP 34, which is run in response to the VOX-ON transition 111 from the reset state 101 of the multi-state operating sequence. For use in the fast response echo correlation, the DSP 34 stores parameters of one or more known, prior full sample rate correlations. These parameters include the pre-filter delay, and the one or more FIR filter coefficients (e.g., by filter stage and value) that have values indicative of echo that is a substantial fraction of the local audio signal (i.e., the zero and near-zero value FIR filter coefficients typically represent noise in the return audio signal, and need not be stored). The prior full sample rate correlation can simply be the last (i.e., immediately preceding) full-rate correlation obtained by the multi-state echo suppressor 10. Alternatively, the multi-state echo suppressor 10 can store parameters of a plurality of prior full-rate correlations as “correlation presets” that are selectable by the user via radio buttons that are part of the controls 136 shown in FIG. 6 (similar to channel presets on an FM radio tuner). The parameters of the prior full-rate correlation are stored during the prior iteration through the multi-state operation sequence immediately upon the correlation being achieved (e.g., on transition to the run state 105).

[0073] In the fast response correlation routine (begun on the VOX-ON transition 111), the DSP 34 simultaneously runs two separate correlation operations. In a first of these simultaneous correlation operations, the DSP 34 runs the normal correlation checks of the on-air check state 102 and correlate-and-wait state 103, as described above for the Multi-State Operation. Specifically, a new reduced-rate correlation with a wide window (e.g., about 250 ms) is performed in the on-air check state 102, which is followed by a new full-rate correlation in the correlate-and-wait state 103 if the reduced-rate correlation results in the on-air condition 112. These new correlations are begun with all FIR filter coefficients reset to zero. The pre-filter delay for the full-rate correlation is based on the results of the reduced-rate correlation.

[0074] In the second simultaneous correlation operation, the DSP 34 runs a full-rate correlation of the correlate-and-wait state 103 alone, which is based on the stored parameters (e.g., of a user-selected correlation preset). The pre-filter delay of this full-rate correlation is set to that of the stored, prior full-rate correlation. The FIR filter coefficients also are initially set according to those of the stored, prior full-rate correlation. So that correlation is not immediately signaled before the adaptation process has the opportunity to detect actual echo in the return audio signal, the FIR filter coefficients can be appropriately scaled down from those of the prior full-rate correlation (i.e., scaled below the correlation threshold indicative of echo at a substantial fraction of the local audio signal). Also, the near-zero valued FIR filter coefficients (which typically correspond to noise in the return audio signal during the prior full-rate correlation) are initially reset to zero.
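
One way to seed the second correlation from a stored preset is sketched below. The storage layout (a mapping from filter stage to coefficient value), the `headroom` factor and the `noise_floor` cutoff are assumptions made for illustration; the text specifies only that the seeded coefficients start below the correlation threshold and that near-zero coefficients are reset.

```python
def seed_from_preset(stored, num_taps, threshold=0.5,
                     headroom=0.8, noise_floor=0.05):
    """Build initial FIR coefficients from a stored correlation preset.

    stored: dict mapping filter stage -> coefficient value (assumed layout).
    The largest seeded magnitude ends up at headroom * threshold, so the
    correlation condition is not signaled before adaptation confirms it.
    """
    coeffs = [0.0] * num_taps
    significant = {i: v for i, v in stored.items()
                   if 0 <= i < num_taps and abs(v) > noise_floor}
    if not significant:
        return coeffs                   # nothing usable: start from zero
    peak = max(abs(v) for v in significant.values())
    scale = min(1.0, headroom * threshold / peak)
    for i, v in significant.items():
        coeffs[i] = v * scale
    return coeffs


preset = {10: 0.8, 11: 0.2, 3: 0.01}    # stage 3 is treated as noise
seeded = seed_from_preset(preset, num_taps=16)
print(max(abs(c) for c in seeded) < 0.5)
```

Scaling all significant coefficients by one common factor keeps the stored impulse shape while leaving the adaptation process to raise the peak back above the threshold only if the echo is actually present.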

[0075] The adaptation process 92 of FIG. 4 is then run on each of the first and second correlation operations simultaneously, until one of the correlation operations produces the correlation condition 116 at the full sample rate. The successful correlation (with its pre-filter delay) is then used for echo suppression in the run state 105.

[0076] The fast response correlation routine typically provides a faster response where the delay of the echo is known or expected to remain constant, and additionally protects against a failure to correlate when the delay deviates outside the expected “window.” The second correlation operation (based on the stored correlation preset) results in a fast response echo suppression where the delay remains consistent with the prior correlation. For example, compared to the normal operation where the delay is unknown (which can take the time of several spoken syllables), the second correlation can complete in the duration of part of a spoken syllable. The first correlation operation simultaneously provides a failsafe by checking for echo outside the second correlation operation's narrow window.

[0077] The illustrated multi-state echo suppressor 10 is equipped with a correlation mode select switch (part of the controls 136 of FIG. 6, such as a slide switch located on the housing of the echo suppressor 10), which allows selection between a normal correlation operation (as described in the description of the Multi-State Operation above) or the fast response echo correlation routine based on correlation presets.

[0078] In some alternative embodiments of the multi-state echo suppressor, the DSP 34 lacks a fast response correlation routine that simultaneously performs the first and second operations. The correlation mode select switch instead selects between a full correlation mode (with the normal correlation operation) for use when the delay is unknown, and a narrow search mode for use when the delay is known. The narrow search mode performs only the full-rate correlation of the second correlation operation described above (without the simultaneous “failsafe” first correlation operation).

[0079] Audio Console/Remote Return Input Switching

[0080] In some embodiments of the illustrated multi-state echo suppressor 10, the multi-state echo suppressor includes two inputs for the return loop 24. One of these inputs is for connecting a remote return signal (such as a broadcast program received from-air). The other input connects to an audio console at the local site for receiving a return of the local audio signal immediately prior to transmission, and after any local processing of the signal. The echo suppressor includes a switch for selecting which of the signals is connected in the return loop, so as to allow the user to select which of the two signals to monitor (without echo).

[0081] Having described and illustrated the principles of our invention with reference to an illustrated embodiment, it will be recognized that the illustrated embodiment can be modified in arrangement and detail without departing from such principles. For example, it should be understood that the programs, processes, or methods described herein are not related or limited to any particular type of signal processing apparatus, unless indicated otherwise. Various types of general purpose or specialized apparatus may be used with or perform operations in accordance with the teachings described herein. Elements of the illustrated embodiment shown in software may be implemented in hardware and vice versa.

[0082] In view of the many possible embodiments to which the principles of our invention may be applied, it should be recognized that the detailed embodiments are illustrative only and should not be taken as limiting the scope of our invention. Rather, we claim as our invention all such embodiments as may come within the scope and spirit of the following claims and equivalents thereto.

1. An echo suppressor in an audio signal transmission system in which an audio signal originating at a local site is transmitted on the system and a return audio signal that potentially contains echo of the local audio signal is received from the system, the echo suppressor operating at the local site and comprising: a first input for said locally originating audio signal; a second input for said return audio signal received at the local site; an output for audio monitoring at the local site; a signal processor operating to process said locally originating audio signal and said return audio signal for correlating said locally originating audio signal to said echo in said return audio signal and suppressing said echo to produce an echo suppressed return audio signal, and to provide the echo suppressed return audio signal at the output for monitoring at the local site; and a voice activated switch operating in response to voice activity in the locally originating audio signal to switch from providing said return audio signal at the output to providing said locally originating audio signal at the output until correlation is achieved.
 2. An echo suppressor in an audio signal transmission system in which an audio signal originating at a local site is transmitted on the system and a return audio signal that potentially contains an echo of the locally originating audio signal is received from the system at the local site, the echo suppressor operating at the local site and comprising: a first input for said locally originating audio signal; a second input for said return audio signal received at the local site; an output for audio monitoring at the local site; and a signal processor operating to process said locally originating audio signal and said return audio signal for correlating said locally originating audio signal to said echo in said return audio signal and suppressing said echo to produce an echo suppressed return audio signal, and to provide the echo suppressed return audio signal at the output for monitoring at the local site; the signal processor further operating to initially perform a reduced sampling rate correlation of said locally originating audio signal to said echo in said return audio signal to identify an approximate delay of said echo from said locally originating audio signal within a wider window of time, and to perform a full sampling rate correlation of said locally originating audio signal to said echo in said return audio signal within a narrower window of time defined according to the approximate delay for use in suppressing said echo from said return audio signal.
 3. The echo suppressor of claim 2 wherein the signal processor additionally operates to store parameters of a prior correlation, and to perform a second full sampling rate correlation based on the stored parameters concurrently with the reduced sampling rate correlation, so as to more quickly correlate said locally originating audio signal to said echo when the audio signal transmission system causes a consistent delay of said echo.
 4. The echo suppressor of claim 3 further comprising: data storage for retaining parameters of a plurality of prior correlations of said locally originating audio signal to said echo as plural correlation presets; controls for actuating by a user to select from the correlation presets; and the signal processor being operative to perform the second full sampling rate correlation based on the parameters of the selected correlation preset.
 5. The echo suppressor of claim 3 further comprising: data storage for retaining parameters of an immediately preceding successful correlation of said locally originating audio signal to said echo; and the signal processor being operative to perform the second full sampling rate correlation based on the retained parameters of the immediately preceding successful correlation.
 6. An echo suppressor in an audio signal transmission system in which an audio signal originating at a local site is transmitted on the system and a return audio signal that potentially contains an echo of the locally originating audio signal is received from the system at the local site, the echo suppressor operating at the local site and comprising: a first input for said locally originating audio signal; a second input for said return audio signal received at the local site; an output for audio monitoring at the local site; a signal processor operating to process said locally originating audio signal and said return audio signal for correlating said locally originating audio signal to said echo in said return audio signal and suppressing said echo to produce an echo suppressed return audio signal, and to provide the echo suppressed return audio signal at the output for monitoring at the local site; the signal processor further operating to simultaneously perform a reduced sampling rate correlation of said locally originating audio signal to said echo in said return audio signal to identify an approximate delay of said echo from said locally originating audio signal within a wider window of time, and to perform a full sampling rate correlation of said locally originating audio signal to said echo in said return audio signal within a narrower window of time, the full sampling rate correlation being preset according to stored parameters of a prior successful correlation of said locally originating audio signal to said echo in said return audio signal.
 7. A method of suppressing echo of a locally originating audio signal from a return transmission audio signal, the method comprising the steps of: detecting voice activity in the locally originating audio signal; processing the locally originating audio signal and return transmission audio signal according to a multi-state operating sequence having at least one correlate state during which the processing correlates the locally originating audio signal to echo in the return transmission audio signal, and an echo suppress state in which the processing suppresses echo in the return transmission audio signal to produce an echo suppressed audio signal; selectively switching among at least the locally originating audio signal and the echo suppressed audio signal for output to a monitoring device depending on a current state in the multi-state operating sequence; causing a transition to a first of the at least one correlate state in response to the presence of voice activity in the locally originating audio signal being detected; and switching to output of the locally originating audio signal in the first of the at least one correlate state.
 8. The method of claim 7 further comprising the step of: switching to output of the echo suppressed audio signal in the echo suppress state.
 9. The method of claim 8 further comprising the step of: switching to output of a combination of the locally originating audio signal and the echo suppressed audio signal in the echo suppress state.
 10. The method of claim 9 further comprising the step of: reinforcing the amplitude of the locally originating audio signal relative to said echo suppressed audio signal in said combination to thereby further mask said echo.
 11. The method of claim 10 wherein the amplitude of the locally originating audio signal is reinforced according to a user-selectable setting.
 12. The method of claim 9 further comprising the step of: matching the amplitude of the locally originating audio signal to that of said echo suppressed audio signal in said combination.
 13. The method of claim 7 further comprising the step of: switching to output of the return transmission audio program in an initial state prior to the at least one correlate state.
 14. The method of claim 13 further comprising the step of: switching to output of a combination of the locally originating audio signal and the return transmission audio signal in the initial state.
 15. The method of claim 7 wherein the step of selectively switching comprises the steps of: continuously passing the locally originating audio signal for output to the monitoring device; selectively adding the echo suppressed audio signal to the locally originating audio signal for output to the monitoring device depending on the current state in the sequence.
 16. The method of claim 15 further comprising the step of: amplifying the added locally originating audio signal relative to said echo suppressed audio signal for said output to the monitoring device to thereby further mask said echo.
 17. The method of claim 16 wherein the added locally originating audio signal is amplified by a user-adjustable gain ratio relative to the echo suppressed audio signal.
 18. The method of claim 15 further comprising the steps of: adding the return transmission audio signal to the locally originating audio signal in an initial state prior to the at least one correlate state for output to the monitoring device; passing only the locally originating audio signal for output to the monitoring device in the at least one correlate state; and adding the echo suppressed audio signal to the locally originating audio signal for output to the monitoring device in the echo suppress state.
 19. The method of claim 15 further comprising the steps of: attenuating the return transmission audio signal; and adding the attenuated, return transmission audio signal to the locally originating audio signal for output to the monitoring device in the at least one correlate state.
 20. The method of claim 7 further comprising the steps of: in the first of the at least one correlate state, performing a reduced sampling rate correlation of the locally originating audio signal to the echo in the return transmission audio signal to determine whether the locally originating audio signal is on-the-air and to approximately determine delay of the echo from the locally originating audio signal within a wider window of time; causing a transition to a second of the at least one correlate state in response to determining that the locally originating audio signal is on-the-air in the first of the at least one correlate state; and in the second of the at least one correlate state, performing a first full sampling rate correlation of said locally originating audio signal to said echo in said return transmission audio signal within a narrower window of time defined according to the approximate delay so as to more exactly converge on the delay of the echo from the locally originating audio signal for use in suppressing said echo from said return transmission audio signal.
 21. The method of claim 20 further comprising the steps of: simultaneous with at least one of the steps of performing the reduced sampling rate correlation within the wider window and performing the first full sampling rate correlation within the narrower window, separately performing a second full sampling rate correlation of the locally originating audio signal to the echo in the return transmission audio signal, the second full sampling rate correlation being preset using stored parameters of a prior successful correlation.
 22. A method of suppressing echo of a locally generated audio signal from a return transmission audio signal, comprising the steps of: detecting activity on the locally generated audio signal; in response to the presence of activity on the locally generated audio signal, processing the locally generated audio signal in relation to the return transmission audio signal using a finite impulse response filter having coefficients adapted according to a stochastic gradient adaptation process at a first sampling rate to detect the presence of the echo of the locally generated audio signal in the return transmission audio signal at an approximate delay time within a first window of time; in response to the detection of the echo, processing the locally generated audio signal in relation to the return transmission audio signal using the finite impulse response filter at a second sampling rate to correlate the locally generated audio signal to its echo in the return transmission audio signal within a second window of time displaced according to the approximate delay time, the second sampling rate being greater than twice the first sampling rate so that the first window is wider than the second window; and after achieving correlation, suppressing the echo of the locally generated audio signal from the return transmission audio signal using the finite impulse response filter.
 23. The method of claim 22 further comprising the steps of: passing the return transmission audio signal to an output for monitoring; and attenuating the return transmission audio signal at the output during the steps of processing to detect echo and of processing to correlate to the echo.
 24. The method of claim 22 further comprising the steps of: providing output for monitoring the return transmission audio signal; and switching output to the locally generated audio signal during the steps of processing to detect echo and of processing to correlate to the echo.
 25. The method of claim 22 further comprising the steps of: pausing the processing using the finite impulse response filter to detect echo if activity is no longer detected on the locally generated audio signal; retaining while paused the coefficients of the finite impulse response filter thus far adapted during the processing; and resuming the processing to detect echo with the retained coefficients in response to again detecting the presence of activity on the locally generated audio signal.
 26. The method of claim 22 further comprising the steps of: pausing the processing using the finite impulse response filter to correlate to the echo if activity is no longer detected on the locally generated audio signal; retaining while paused the coefficients of the finite impulse response filter thus far adapted during the processing; and resuming the processing to correlate to the echo with the retained coefficients in response to again detecting the presence of activity on the locally generated audio signal.
 27. The method of claim 22 further comprising the steps of: pausing the suppressing of the echo using the finite impulse response filter if activity is no longer detected on the locally generated audio signal; retaining while paused the coefficients of the finite impulse response filter thus far adapted during the processing; and resuming the suppressing of the echo using the finite impulse response filter in response to again detecting the presence of activity on the locally generated audio signal.
 28. The method of claim 22 further comprising the step of: determining that correlation is achieved during processing to correlate to the echo if at least one of the coefficients is adapted to a value indicating the echo is greater than a first fraction of the locally generated audio signal.
 29. The method of claim 28 further comprising the steps of: continuing to adapt the coefficients of the finite impulse response filter during suppressing the echo; and repeating the method of claim 22 if the coefficients decrease to less than a second fraction of the locally generated audio signal, the second fraction being less than the first fraction.
 30. The method of claim 29 further comprising the step of retaining the coefficients of the finite impulse response filter from a prior iteration of the method of claim 22 for use in a subsequent iteration of the method of claim 22.
 31. An echo suppressor comprising: a first input for a local signal; a second input for a return signal; and a signal processor for processing the local signal and the return signal using a finite impulse response filter having a plurality of adaptively adjustable coefficients to correlate the local signal to an echo of the local signal in the return signal, the signal processor operating in a first mode to initially process the signals using the finite impulse response filter at a reduced sample rate to detect the echo to an approximate delay within an expanded window, and in a second mode to process the signals using the finite impulse response filter at an increased sample rate to correlate to the echo within a narrower window about the approximate delay, whereby the echo suppressor can more quickly correlate to echo over a wide range of delays.
 32. The echo suppressor of claim 31 further comprising: a voice activated switch for initiating the first mode upon detection of activity in the local signal; and a correlation detector for initiating the second mode upon the coefficients attaining values indicating echo greater than a predetermined fraction of the local signal.
 33. The echo suppressor of claim 32 further comprising: a signal output; and an output switcher operating in response to the voice activated switch and the correlation detector for switching from passing the return signal to the signal output over to passing the local signal to the signal output while the signal processor operates in the first and second modes until correlation is achieved. 
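
The two-rate correlation recited in claims 20, 22 and 31 can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes a normalized LMS update as one example of a stochastic gradient adaptation process, uses block averaging as a crude stand-in for rate reduction, and runs on a synthetic noise signal with a single echo. All names (decimate, nlms, two_stage) and parameter values (32 taps, a rate factor of 4, a detection fraction of 0.25) are hypothetical choices made for illustration only.

```python
import random

def decimate(x, factor):
    """Assumed rate reduction: average each block of `factor` samples."""
    return [sum(x[i:i + factor]) / factor
            for i in range(0, len(x) - factor + 1, factor)]

def nlms(local, ret, num_taps, offset=0, mu=0.5, eps=1e-9):
    """Adapt FIR coefficients w so that sum_k w[k] * local[n - offset - k]
    tracks ret[n]. Returns (w, residual), residual = ret - echo estimate."""
    w = [0.0] * num_taps
    residual = []
    for n in range(len(ret)):
        taps = []
        for k in range(num_taps):
            i = n - offset - k
            taps.append(local[i] if 0 <= i < len(local) else 0.0)
        estimate = sum(wk * xk for wk, xk in zip(w, taps))
        err = ret[n] - estimate
        power = sum(xk * xk for xk in taps) + eps
        step = mu * err / power          # normalized gradient step
        w = [wk + step * xk for wk, xk in zip(w, taps)]
        residual.append(err)
    return w, residual

def peak_tap(w):
    """Index of the coefficient with the largest magnitude."""
    return max(range(len(w)), key=lambda k: abs(w[k]))

def two_stage(local, ret, num_taps=32, factor=4, fraction=0.25):
    """Coarse search at reduced rate, then fine search at full rate.
    Returns (approx_delay, delay, residual) or None if no echo is found."""
    # First mode: same tap count at 1/factor rate spans a window
    # `factor` times wider, at `factor`-sample resolution.
    w1, _ = nlms(decimate(local, factor), decimate(ret, factor),
                 num_taps, mu=0.2)
    k1 = peak_tap(w1)
    if abs(w1[k1]) < fraction:
        return None                      # local audio not detected on-air
    approx = k1 * factor
    # Second mode: full rate, narrow window displaced to the approximate delay.
    offset = max(0, approx - num_taps // 2)
    w2, residual = nlms(local, ret, num_taps, offset=offset)
    k2 = peak_tap(w2)
    if abs(w2[k2]) < fraction:
        return None                      # correlation not achieved
    return approx, offset + k2, residual

# Synthetic example (assumed values): white noise with one delayed echo.
rng = random.Random(0)
local = [rng.gauss(0.0, 1.0) for _ in range(4000)]
true_delay, echo_gain = 37, 0.6
ret = [echo_gain * local[n - true_delay] if n >= true_delay else 0.0
       for n in range(len(local))]
result = two_stage(local, ret)
```

In this sketch the same number of coefficients covers 128 samples of delay in the first mode and 32 samples in the second, which is the sense in which the reduced-rate window is the wider one. The residual returned by the second stage is the return signal with the estimated echo subtracted, corresponding to the suppression step that follows correlation.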